Agrobacterium tumefaciens P4 is a quorum-sensing-signal-producing
bacterium that was isolated from the tobacco rhizosphere. This strain
belongs to genomospecies 1 of the A. tumefaciens complex; it is
avirulent on various putative host plants, devoid of the Ti plasmid,
and contains a luxI homolog on the At plasmid.

Among the cultured community collected from a tobacco rhizosphere,
Agrobacterium tumefaciens strain P4 was identified as an isolate that
produces quorum-sensing (QS) signals of the N-acyl homoserine lactone
(AHL) class (1). However, this strain is avirulent on different hosts
(Datura stramonium and tomato plants) and lacks the Ti plasmid and,
hence, the traI gene that is responsible for the synthesis of the QS
signal 3-oxo-octanoyl-homoserine lactone (3OC8-HSL) in agrobacteria.
Using thin-layer chromatography, with commercial 3OC8-HSL as a
reference and A. tumefaciens NT1(pZLR4) as an AHL biosensor strain (2),
we confirmed that strain P4 produces a large amount of a molecule that
is not 3OC8-HSL but nevertheless activates Agrobacterium QS-regulated
tra genes. The exact structure of this molecule is currently being
investigated.

Here, we report the de novo genome assembly of A. tumefaciens strain
P4. Two libraries were constructed using the TruSeq SBS version 3
sequencing kit: a shotgun (SG) paired-end library with a fragment size
between 150 and 500 bp and a long jumping distance (LJD) mate-pair
library with an average insert size of 7,765 bp. The two libraries were
sequenced using a 2 × 100-bp paired-end read module of the Illumina
HiSeq 2000 by Eurofins Genomics (France). Sequence reads with low
quality (<0.05), ambiguous nucleotides (n > 1), or a sequence length of
<50 nucleotides were discarded prior to assembly. After trimming, we
retained 49,531,690 paired-end reads (4,695,604,212 bases) with an
average length of 94.8 bp and 3,283,394 mate-pair reads (271,536,684
bases) with an average length of 82.7 bp. Sequence assembly was carried
out using the CLC Genomics Workbench version 5.5 (CLC bio, Aarhus,
Denmark), with a read length of 0.5 and a similarity of 0.8. Eighteen
contigs were obtained, with lengths ranging from 2.4 kbp to 884 kbp and
an N50 value of 532,842 bp. Scaffolding was performed using SSPACE
basic version 2.0 (3). The in silico finishing of some gaps was carried
out by mapping the mate-pair reads (read length of 0.9 and similarity
of 0.95) onto the 5-kbp ends of each contig. Next, the collected reads
were used for local de novo assembly (read length of 0.5 and similarity
of 0.8). The published sequence is composed of nine contigs (from
53.8 kbp to 1.63 Mbp) grouped in 3 scaffolds, with a coverage rate
ranging from 853- to 965-fold.
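
The assembly statistics quoted above can be checked with a few lines of
arithmetic. The following is a minimal Python sketch, not the procedure
used by the CLC Genomics Workbench or SSPACE. The function names (n50,
fold_coverage) and the toy contig lengths are hypothetical and serve
only as illustration, since the lengths of the 18 individual contigs
are not listed here. The read and base counts are those given in the
text, and the genome size is taken as the sum of the three replicon
sizes reported for strain P4.

```python
def n50(lengths):
    """Return the N50: the length L of the contig at which the running sum
    of contigs, sorted from longest to shortest, first reaches half of the
    total assembled bases."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def fold_coverage(sequenced_bases, genome_size):
    """Mean sequencing depth: retained bases divided by genome size."""
    return sequenced_bases / genome_size


# Hypothetical contig lengths (bp), for illustration only.
toy_contigs = [900_000, 700_000, 500_000, 300_000, 100_000]
toy_n50 = n50(toy_contigs)

# Figures reported in the text.
PAIRED_END_READS = 49_531_690
PAIRED_END_BASES = 4_695_604_212
MATE_PAIR_READS = 3_283_394
MATE_PAIR_BASES = 271_536_684
REPLICON_SIZES = [2_856_286, 2_052_829, 661_825]  # bp

mean_pe_length = PAIRED_END_BASES / PAIRED_END_READS
mean_mp_length = MATE_PAIR_BASES / MATE_PAIR_READS

genome_size = sum(REPLICON_SIZES)
mean_coverage = fold_coverage(PAIRED_END_BASES + MATE_PAIR_BASES, genome_size)

print(f"toy N50: {toy_n50} bp")
print(f"mean read length: {mean_pe_length:.1f} bp (PE), {mean_mp_length:.1f} bp (MP)")
print(f"genome-wide mean coverage: {mean_coverage:.1f}-fold")
```

Under these assumptions, the mean read lengths reproduce the reported
94.8 and 82.7 bp, and the genome-wide mean depth of roughly 892-fold
lies within the reported per-scaffold range of 853- to 965-fold.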

The A. tumefaciens strain P4 genome consists of one circular
chromosome containing 2,856,286 bp, one linear chromosome containing
2,052,829 bp, and one circular At plasmid containing 661,825 bp. The
G+C contents are 58.8%, 58.6%, and 56.7% for the circular chromosome,
linear chromosome, and At plasmid, respectively. A total of 5,379
putative coding sequences were predicted using the Rapid Annotations
using Subsystems Technology (RAST) version 4.0 automated pipeline (4).
A survey of the P4 genome revealed the presence of a luxI homolog on
the At plasmid, the function of which remains to be investigated.
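
For readers who want genome-wide figures, the per-replicon values above
can be combined as follows. This is a small illustrative Python sketch;
the names REPLICONS and genome_summary are hypothetical. The overall
G+C content and the coding-sequence density are derived here from the
rounded per-replicon values and are not figures stated in this report,
so they should be read as approximations.

```python
# Replicon sizes (bp) and G+C contents (%) as reported in the text.
REPLICONS = {
    "circular chromosome": (2_856_286, 58.8),
    "linear chromosome": (2_052_829, 58.6),
    "At plasmid": (661_825, 56.7),
}
PREDICTED_CDS = 5_379  # putative coding sequences predicted by RAST


def genome_summary(replicons):
    """Return (total size in bp, size-weighted mean G+C content in %)."""
    total_bp = sum(size for size, _ in replicons.values())
    weighted_gc = sum(size * gc for size, gc in replicons.values()) / total_bp
    return total_bp, weighted_gc


total_bp, overall_gc = genome_summary(REPLICONS)

# Derived value: approximate number of predicted CDS per kbp of genome.
cds_per_kbp = PREDICTED_CDS / (total_bp / 1_000)

print(f"total genome size: {total_bp:,} bp")
print(f"size-weighted G+C content: {overall_gc:.2f}%")
print(f"predicted CDS per kbp: {cds_per_kbp:.2f}")
```

The total genome size obtained this way is 5,570,940 bp, with a
size-weighted G+C content of about 58.5%.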

Nucleotide sequence accession numbers. The A. tumefaciens P4 genome
sequence has been deposited at DDBJ/EMBL/GenBank under the accession
no. APJV00000000. The version described in this paper is the first
version, APJV01000000.